NACSIS Test Collection Workshop (NTCIR-1)

Authors

  • Noriko Kando
  • Kazuko Kuriyama
  • Toshihiko Nozue
Abstract

The test collection used in the Workshop consists of more than 330,000 documents, more than half of which are English-Japanese pairs. Although a Japanese test collection called BMIRJ2, consisting of 5,080 newspaper articles, already exists [2], Japanese test collections need to be enhanced in both the variety of text types and scale. We emphasize cross-lingual retrieval because it is critical in the Internet environment and for Japanese scientific information retrieval [3].

Similar resources

Asian Language Parsing Evaluated by Hummingbird SearchServer™ at NTCIR-3

Hummingbird submitted ranked result sets for the Chinese, Japanese and Korean Single Language Information Retrieval tracks of the Cross-Language Retrieval Task of the 3rd NII-NACSIS Test Collection for IR Systems Workshop (NTCIR-3). SearchServer 5.3’s segmenter for Asian text, compared to an overlapping n-gram approach, was found to modestly increase precision scores for Japanese, to have a neu...

CJK Experiments with Hummingbird SearchServer™ at NTCIR-5

Hummingbird submitted ranked result sets for the Chinese, Japanese and Korean Single Language Information Retrieval subtasks of the Cross-Lingual Information Retrieval Task of the 5th NII-NACSIS Test Collection for IR Systems Workshop (NTCIR-5). For short Chinese (title) queries, a decompounded word-based approach produced higher (statistically significant) mean average precision and first relev...

NTCIR CLIR Experiments at the University of Maryland

This paper presents results for the Japanese/English cross-language information retrieval task on the NACSIS Test Collection. Two automatic dictionary-based query translation techniques were tried with four variants of the queries. The results indicate that longer queries outperform the required description-only queries and that use of the first translation in the dictionary is comparable with the ...

Evaluation -- the Way Ahead : A Case of the NTCIR

Noriko Kando, National Institute of Informatics (NII), Tokyo. Abstract: This paper introduces activities of the cross-lingual information retrieval (CLIR) systems evaluation in the NTCIR (NII-NACSIS Test Collection for Information Retrieval and Text Processing Technologies) project and suggests several axes as a framework describing the nature of CLIR experiments. Finally it menti...

The Very Large Collection and Web Tracks (Preprint version)

Together, the TREC Very Large Collection (VLC) Track and its successor the Web Track have run for seven years, after an initial VLC pre-track. During that time five new test collections have been created, five different types of retrieval task have been studied, a large number of important issues have been addressed, and new methods have been tried, not only for retrieval, but also for test col...

Publication date: 2007